Performance and Construction of Polar Codes on
Symmetric Binary-Input Memoryless Channels

Abstract: Channel polarization is a method of constructing
capacity-achieving codes for symmetric binary-input discrete
memoryless channels (B-DMCs) [1]. In the original paper, the
construction complexity is exponential in the blocklength. In
this paper, a new construction method for arbitrary symmetric
binary-input memoryless channels (B-MCs) with linear complexity
in the blocklength is proposed. Furthermore, a new upper bound
and a new lower bound of the block error probability of polar
codes are derived for the BEC and arbitrary symmetric B-MCs,
respectively.

I. Introduction 

Channel polarization, introduced by Arikan [1], is a method
of constructing capacity-achieving codes for symmetric
binary-input discrete memoryless channels (B-DMCs). Polar codes,
which are realized by channel polarization, require only low
encoding and decoding complexity for achieving capacity.
Furthermore, it was shown by Arikan and Telatar [2] that the
block error probability of polar codes is $O(2^{-N^{\beta}})$ for
any fixed $\beta < 1/2$, where $N$ is the blocklength. This decay
is significantly fast, since the block error probability of
low-density parity-check (LDPC) codes is polynomial in $N$ [3].
However, in [1], code construction with polynomial complexity is
introduced only for the binary erasure channel (BEC). The main
result of this paper is to show code construction with $O(N)$
complexity for arbitrary symmetric binary-input memoryless
channels (B-MCs). Furthermore, a new upper bound and a lower
bound of the block error probability of polar codes are derived
for the BEC and arbitrary symmetric B-MCs, respectively. In
Section II, channel polarization and polar codes introduced in
[1] are described. In Section III, the construction method for
arbitrary symmetric B-MCs is shown. In Section IV, a lower bound
of the block error probability of polar codes is derived for
arbitrary symmetric B-MCs. In Section V, a new upper bound of the
block error probability of polar codes over the BEC is derived.
In Section VI, some techniques for tightening bounds are
discussed. In Section VII, numerical calculation results are
compared with numerical simulation results. Finally, this paper
is concluded in Section VIII.

II. Preliminaries 

A. Channel polarization 

Toshiyuki Tanaka
Department of Systems Science
Kyoto University
Kyoto, 606-8501, Japan
Email: tt@i.kyoto-u.ac.jp

Let the blocklength $N$ be an integer power of 2. In [1],
Arikan discussed channel polarization on the basis of an
$N \times N$ matrix $G_N$, which he called the generator matrix,
defined recursively as

$$G_{2^n} := R_{2^n}\,(F \otimes G_{2^{n-1}}), \qquad G_2 := F := \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix},$$

where $\otimes$ denotes the Kronecker product and where $R_{2^n}$
denotes the so-called reverse shuffle matrix, which is a
permutation matrix.
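The recursive definition of the generator matrix can be sketched in a few lines of Python. This is only an illustration: the function names `kron`, `generator_matrix` and `encode` are hypothetical, and the reverse shuffle is assumed to follow the convention of [1], in which a row vector $(s_1, \ldots, s_N)$ is mapped to $(s_1, s_3, \ldots, s_{N-1}, s_2, s_4, \ldots, s_N)$.

```python
F = [[1, 0], [1, 1]]


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def generator_matrix(n):
    """Return G_{2^n} from G_{2^n} = R_{2^n} (F kron G_{2^(n-1)}), G_2 = F."""
    if n == 1:
        return [row[:] for row in F]
    m = kron(F, generator_matrix(n - 1))
    half = len(m) // 2
    # Assumed reverse-shuffle convention: (s_1, ..., s_N) R_N equals
    # (s_1, s_3, ..., s_{N-1}, s_2, s_4, ..., s_N). Left-multiplying by R_N
    # therefore interleaves the top and bottom halves of the rows of m.
    rows = []
    for j in range(half):
        rows.append(m[j])
        rows.append(m[half + j])
    return rows


def encode(u, g):
    """Compute x = u G in modulo-2 arithmetic for a row vector u."""
    return [sum(u[k] * g[k][col] for k in range(len(u))) % 2
            for col in range(len(g))]
```

With this convention the matrix is its own inverse over GF(2), so applying `encode` twice returns the original vector.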

For a given B-MC $W : \{0,1\} \to \mathcal{Y}$, the log-likelihood
ratio (LLR) $\log(W(y \mid 0)/W(y \mid 1))$ of $y$ is a sufficient
statistic for estimating the input $x \in \{0,1\}$ given the output
$y \in \mathcal{Y}$. Hence, we can associate with $W$ a B-MC
$W' : \{0,1\} \to \mathbb{R}$ that has the LLR of $W$ as its
output, and $W'$ has the same performance as $W$ under maximum a
posteriori (MAP) decoding. In this paper, we deal with symmetric
B-MCs defined as follows.

Definition 1. A B-MC $W : \{0,1\} \to \mathcal{Y}$ is said to be
symmetric if its associated B-MC $W' : \{0,1\} \to \mathbb{R}$
introduced above satisfies $W'(y \mid 0) = W'(-y \mid 1)$.

Let I{W) denote the capacity between the input and the output 
of a symmetric B-MC W. 

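We consider communication over a symmetric B-MC
$W : \{0,1\} \to \mathbb{R}$. Let $u_1^N = (u_1, u_2, \ldots, u_N)$
denote an $N$-dimensional row vector, and let
$u_i^j = (u_i, u_{i+1}, \ldots, u_j)$ be a subvector of $u_1^N$.
Let us consider a vector channel

$$W_N(y_1^N \mid u_1^N) := W^N(y_1^N \mid u_1^N G_N),$$

with input $u_1^N \in \{0,1\}^N$ and output $y_1^N \in \mathbb{R}^N$,
which is obtained by combining $N$ parallel B-DMCs
$W^N(y_1^N \mid x_1^N) := \prod_{i=1}^N W(y_i \mid x_i)$ via the
operation $x_1^N = u_1^N G_N$, which should be performed in
modulo-2 arithmetic. We define subchannels $W_N^{(i)}$ as

$$W_N^{(i)}(y_1^N, u_1^{i-1} \mid u_i) := \frac{1}{2^{N-1}} \sum_{u_{i+1}^N} W_N(y_1^N \mid u_1^N).$$

Let $U_1^N \in \{0,1\}^N$ and $Y_1^N \in \mathbb{R}^N$ be random
variables which follow the joint probability
$W_N(y_1^N \mid u_1^N)/2^N$. The mutual information
$I(U_1^N; Y_1^N)$ is split by applying the chain rule, as

$$\begin{aligned}
I(U_1^N; Y_1^N) &= \sum_{i=1}^N I(U_i; Y_1^N \mid U_1^{i-1}) \\
&= \sum_{i=1}^N \left[ I(U_i; Y_1^N, U_1^{i-1}) - I(U_i; U_1^{i-1}) \right] \\
&= \sum_{i=1}^N I(W_N^{(i)}). \qquad (1)
\end{aligned}$$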



Arikan proved the channel polarization property, which states
that every term in the last line of (1) takes a value near
zero or one, and that since $I(U_1^N; Y_1^N) = N I(W)$, the
approximate numbers of those terms which take values near
one and zero are $N I(W)$ and $N(1 - I(W))$, respectively.
This property suggests the following approach to designing a
capacity-achieving error-correcting code: Pick the elements of
$u_1^N$ which correspond to those subchannels with high mutual
information $I(W_N^{(i)})$, and use them as the information bits.
Non-information bits in $u_1^N$ are clamped to prespecified
values. The values of the non-information bits are assumed to be
all-zero in this paper, since they do not affect the performance
of the resulting codes if the transmitting channel is symmetric.
Instead of choosing subchannels with high mutual information
$I(W_N^{(i)})$, Arikan considered another strategy of
construction: choosing subchannels with low Bhattacharyya
parameters, which is mentioned later in this section.
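For the BEC the polarization property is easy to observe numerically, because each subchannel is again a BEC and its mutual information is one minus its erasure probability. The following sketch is an illustration only; it assumes the index convention used later in this paper (index $2i-1$ for the worse subchannel, $2i$ for the better one), and the function names are hypothetical.

```python
def bec_subchannel_capacities(n, eps):
    """I(W_N^{(i)}) for i = 1..2^n on a BEC with erasure probability eps."""
    erasure = [eps]
    for _ in range(n):
        nxt = []
        for e in erasure:
            nxt.append(2 * e - e * e)  # index 2i-1: worse subchannel
            nxt.append(e * e)          # index 2i: better subchannel
        erasure = nxt
    return [1.0 - e for e in erasure]


def polarized_fractions(caps, delta):
    """Fractions of subchannels with capacity above 1-delta and below delta."""
    total = len(caps)
    near_one = sum(1 for c in caps if c > 1 - delta) / total
    near_zero = sum(1 for c in caps if c < delta) / total
    return near_one, near_zero
```

The sum of the returned capacities equals $N I(W)$, in agreement with the chain rule (1), and `polarized_fractions` can be used to watch the two fractions grow with `n`.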

B. Decoding 

Arikan considered successive cancellation (SC) decoding in
order to achieve capacity with low complexity. In SC decoding,
decoding results for the non-information bits are set to 0. The
information bits are decoded sequentially in the ascending
order of their indices, via maximum likelihood (ML) decoding
of the channel $W_N^{(i)}$. More precisely, the decoding result
of the $i$-th bit is

$$\hat{U}_i(y_1^N, \hat{u}_1^{i-1}) = \arg\max_{u_i = 0, 1} W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} \mid u_i). \qquad (2)$$

If the two likelihood values are equal, the decoder determines
0 or 1 with probability 1/2.

C. Upper bound of performance and construction 

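When a set $\mathcal{I} \subseteq \{1, 2, \ldots, N\}$ of indices
of the information bits is fixed, the block error event, denoted
by $\mathcal{E}$, of the resulting code with SC decoding is a
union over $\mathcal{I}$ of the events $B_{i,N}$ that the first
bit error occurs at the $i$-th bit. One has

$$B_{i,N} = \{u_1^N, y_1^N, c_1^N \mid \hat{u}_1^{i-1} = u_1^{i-1},\ \hat{U}_i(y_1^N, \hat{u}_1^{i-1}) \neq u_i\}
\subseteq \{u_1^N, y_1^N, c_1^N \mid \hat{U}_i(y_1^N, u_1^{i-1}) \neq u_i\} =: A_{i,N},$$

where $c_1^N \in \{0,1\}^N$ denote $N$ independent fair coin flips,
with $c_i$ being used as the decoding result of $u_i$ if the two
likelihood values for $u_i$ are equal. In [1], $P(A_{i,N})$ is
upper bounded in terms of the Bhattacharyya parameter $Z_N^{(i)}$
of the subchannel $W_N^{(i)}$.
Hence, the block error probability is upper bounded as

$$P(\mathcal{E}) = \sum_{i \in \mathcal{I}} P(B_{i,N}) \le \sum_{i \in \mathcal{I}} P(A_{i,N}) \le \frac{1}{2} \sum_{i \in \mathcal{I}} Z_N^{(i)}. \qquad (3)$$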




Fig. 1. The decoding tree for $n = 3$, $i = 4$. The binary expansion
of $(i - 1)$ is 011. Bits 0 and 1 in the expansion correspond to
check nodes and variable nodes, which are drawn as filled squares
and filled circles, respectively. Dashed nodes and edges have
already been determined to be 0 or 1 and are thus eliminated. Thin
nodes and edges are not useful for decoding the fourth bit, since
thin degree-3 check nodes are connected to an unknown variable
node. The leaf nodes are given messages from the channel.



The equality is due to disjointness of $\{B_{i,N}\}$. The first
inequality follows from the above-mentioned inclusion relation
between $A_{i,N}$ and $B_{i,N}$. The last inequality is valid for
arbitrary symmetric channels. In particular,
$Z_N^{(i)} = 2P(A_{i,N})$ if and only if the channel is the BEC.
Arikan proposed a method of designing a code in which one chooses
$\mathcal{I}$ that minimizes the rightmost side of (3), and
called the resulting code a polar code. In this paper, we propose
an alternative code construction strategy in which $P(A_{i,N})$ is
directly evaluated, instead of $Z_N^{(i)}$, and $\mathcal{I}$ that
minimizes $\sum_{i \in \mathcal{I}} P(A_{i,N})$ is chosen. We call
the codes resulting from our strategy polar codes as well.

In the rest of this paper, we use the notations $A_i$ and
$B_i$ instead of $A_{i,N}$ and $B_{i,N}$, respectively, dropping
the blocklength $N$ when it is evident from the context.



III. Construction of Polar codes 

We show in this section that $\{P(A_i)\}$ are regarded as
decoding error probabilities of belief propagation (BP) decoding
on tree graphs, so that they can be evaluated via density
evolution. The Tanner graph of a polar code for $n = 3$ is shown
in Fig. 1. Let us consider the $i$-th step of SC decoding. Since
$u_1^{i-1}$ have been either determined as non-information bits
or decoded in previous steps, the edges incident to these
variable nodes are eliminated. Since $u_{i+1}^N$ do not affect
the characteristics of the channel $W_N^{(i)}$, the degree-3
check nodes connected to them do not work in this stage. Hence,
these check nodes and the edges incident to them are eliminated.
Similarly, degree-3 check nodes incident to undetermined degree-1
variable nodes are also eliminated recursively. The resulting
decoding graph for $u_i$ is tree-like, as shown in Fig. 1. Hence,
the ML decision (2) can be implemented by BP decoding on the tree
graph. The probability $P(A_i)$ is therefore regarded as the
error probability of the root node of the tree graph via BP
decoding, where leaf nodes have messages of the channel. Assume
that the binary expansion of $(i - 1)$ is $b_n \ldots b_1$; then
nodes at depth $t$ of the tree graph are check nodes and variable
nodes if $b_t = 0$ and $b_t = 1$, respectively, as shown in
Fig. 1.

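An LLR for the $i$-th bit, defined as
$L_N^{(i)}(y_1^N, u_1^{i-1}) := \log(W_N^{(i)}(y_1^N, u_1^{i-1} \mid 0)/W_N^{(i)}(y_1^N, u_1^{i-1} \mid 1))$,
is calculated recursively as

$$L_N^{(2i-1)}(y_1^N, u_1^{2i-2}) = 2\tanh^{-1}\bigl(\tanh(L_{N/2}^{(i)}(y_1^{N/2}, u_{1,e}^{2i-2} \oplus u_{1,o}^{2i-2})/2)
\times \tanh(L_{N/2}^{(i)}(y_{N/2+1}^{N}, u_{1,e}^{2i-2})/2)\bigr),$$

$$L_N^{(2i)}(y_1^N, u_1^{2i-1}) = L_{N/2}^{(i)}(y_{N/2+1}^{N}, u_{1,e}^{2i-2}) + (-1)^{u_{2i-1}} L_{N/2}^{(i)}(y_1^{N/2}, u_{1,e}^{2i-2} \oplus u_{1,o}^{2i-2}),$$

where $u_{1,e}^{i}$ and $u_{1,o}^{i}$ denote subvectors which
consist of elements of $u_1^{i}$ with even and odd indices,
respectively, and where $\oplus$ denotes modulo-2 addition. The
above updating rules were originally derived by Arikan [1].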
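The LLR recursion and the SC decision rule can be sketched as follows. This is a minimal illustration rather than an efficient decoder: it recomputes sub-LLRs instead of caching them, it takes channel LLRs as input, and it breaks ties toward 0 instead of flipping a fair coin (a simplification relative to the rule stated in Section II-B). The names `check_update`, `llr`, `sc_decode` and `encode` are hypothetical.

```python
import math


def check_update(a, b):
    """Check-node rule 2 atanh(tanh(a/2) tanh(b/2)), clipped to stay finite."""
    t = math.tanh(a / 2.0) * math.tanh(b / 2.0)
    t = max(min(t, 1.0 - 1e-15), -1.0 + 1e-15)
    return 2.0 * math.atanh(t)


def llr(i, y, u_prev):
    """L_N^{(i)}(y_1^N, u_1^{i-1}); i is 1-based, y holds channel LLRs."""
    n = len(y)
    if n == 1:
        return y[0]
    half = n // 2
    j = (i + 1) // 2                 # index of the length-N/2 sub-problem
    prev = u_prev[: 2 * (j - 1)]     # u_1^{2j-2}
    odd, even = prev[0::2], prev[1::2]
    mixed = [a ^ b for a, b in zip(odd, even)]
    left = llr(j, y[:half], mixed)
    right = llr(j, y[half:], even)
    if i % 2 == 1:                   # index 2j-1: check-node update
        return check_update(left, right)
    sign = -1.0 if u_prev[i - 2] else 1.0   # (-1)^{u_{2j-1}}
    return right + sign * left       # index 2j: variable-node update


def sc_decode(channel_llrs, info_set):
    """Successive cancellation; non-information bits are fixed to 0."""
    u_hat = []
    for i in range(1, len(channel_llrs) + 1):
        if i in info_set:
            # Simplification: ties (LLR == 0) are resolved as 0.
            u_hat.append(0 if llr(i, channel_llrs, u_hat) >= 0 else 1)
        else:
            u_hat.append(0)
    return u_hat


def encode(u):
    """x = u G_N, using x = ((u_o xor u_e) G_{N/2}, u_e G_{N/2})."""
    if len(u) == 1:
        return list(u)
    odd, even = u[0::2], u[1::2]
    return encode([a ^ b for a, b in zip(odd, even)]) + encode(even)
```

On an error-free received word, for example LLRs of +3 for a transmitted 0 and -3 for a transmitted 1, `sc_decode` returns the transmitted information vector.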

It is well known in the field of LDPC codes that the error
probability of the root node of a tree graph after message
passing decoding is calculated via density evolution. For
the analysis of the error probability on symmetric channels,
it is assumed without loss of generality that the all-zero
message is transmitted. The following theorem for symmetric
B-MCs is a consequence of a well-known result in [3] and was
also obtained for the BEC by Arikan [1].

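Theorem 1. For a symmetric B-MC which has a density $a_W$
of LLR, it holds that $P(A_i) = \mathfrak{E}(a_N^i)$, where

$$\mathfrak{E}(a) := \lim_{\epsilon \to +0} \left( \int_{-\infty}^{-\epsilon} a(x)\,dx + \frac{1}{2} \int_{-\epsilon}^{+\epsilon} a(x)\,dx \right),$$

$$a_{2N}^{2i} = a_N^i \star a_N^i, \qquad a_{2N}^{2i-1} = a_N^i \boxplus a_N^i, \qquad a_1^1 = a_W,$$

and where $\star$ and $\boxplus$ denote the convolutions of LLR
density functions, which are defined in [3], corresponding to
variable and check nodes, respectively.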
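As an illustration of Theorem 1, the sketch below runs density evolution exactly for a binary symmetric channel, whose LLR density is discrete and can be stored as a dictionary from LLR value to probability. The choice of the BSC, the dictionary representation, the rounding of keys to merge equal LLRs, and all function names are assumptions made for this example; a general B-MC would need a quantized density instead.

```python
import math

TOL = 1e-9  # LLRs closer to zero than this are treated as ties


def bsc_llr_density(p):
    """LLR density of a BSC(p), 0 < p < 1/2, under all-zero transmission."""
    l = math.log((1 - p) / p)
    return {l: 1 - p, -l: p}


def _combine(a, b, op):
    out = {}
    for x, px in a.items():
        for y, py in b.items():
            z = round(op(x, y), 9)      # merge numerically equal LLRs
            out[z] = out.get(z, 0.0) + px * py
    return out


def var_conv(a, b):
    """Variable-node convolution: LLRs add."""
    return _combine(a, b, lambda x, y: x + y)


def chk_conv(a, b):
    """Check-node convolution: 2 atanh(tanh(x/2) tanh(y/2))."""
    def op(x, y):
        t = math.tanh(x / 2) * math.tanh(y / 2)
        t = max(min(t, 1 - 1e-15), -1 + 1e-15)
        return 2 * math.atanh(t)
    return _combine(a, b, op)


def error_functional(a):
    """Mass on negative LLRs plus half of the mass at zero."""
    neg = sum(p for x, p in a.items() if x < -TOL)
    tie = sum(p for x, p in a.items() if abs(x) <= TOL)
    return neg + 0.5 * tie


def density_evolution(a_w, n):
    """Return [P(A_1), ..., P(A_N)] for N = 2^n."""
    dens = [a_w]
    for _ in range(n):
        nxt = []
        for a in dens:
            nxt.append(chk_conv(a, a))  # index 2i-1
            nxt.append(var_conv(a, a))  # index 2i
        dens = nxt
    return [error_functional(a) for a in dens]
```

For $N = 2$ and crossover probability $p$ this gives $P(A_1) = 2p(1-p)$ and $P(A_2) = p$, where the second value includes the half weight given to ties.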

On the basis of the availability of the $P(A_i)$s assured by
Theorem 1, we propose the following code construction procedure:
Choose $\mathcal{I}$ which minimizes

$$\sum_{i \in \mathcal{I}} P(A_i) \qquad (4)$$

subject to $|\mathcal{I}| = NR$. The block error probability of
the resulting codes also decays like $O(2^{-N^{\beta}})$ for any
$\beta < 1/2$ as in [2], since the upper bound of the block error
probability, given in terms of Bhattacharyya parameters in [2],
is also an upper bound of the block error probability of the
codes constructed via the proposed method.
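For the BEC the construction procedure reduces to a scalar recursion, since the density of each subchannel is described by a single erasure probability and $P(A_i)$ is half of it (ties are resolved by a fair coin). The sketch below is illustrative only: the function names are hypothetical, and it uses a full sort for brevity where the paper relies on linear-time selection.

```python
def bec_error_probs(n, eps):
    """P(A_i), i = 1..2^n, for a BEC with erasure probability eps."""
    erasure = [eps]
    for _ in range(n):
        nxt = []
        for e in erasure:
            nxt.append(2 * e - e * e)  # index 2i-1: check-node step
            nxt.append(e * e)          # index 2i: variable-node step
        erasure = nxt
    return [e / 2 for e in erasure]


def construct(n, eps, rate):
    """Pick the NR indices with the smallest P(A_i); return them and sum (4)."""
    probs = bec_error_probs(n, eps)
    k = round(rate * len(probs))
    # Sorting costs O(N log N); a selection algorithm would give O(N).
    order = sorted(range(len(probs)), key=lambda idx: probs[idx])
    info = sorted(idx + 1 for idx in order[:k])
    bound = sum(probs[i - 1] for i in info)
    return info, bound
```

For $N = 8$, $\epsilon = 0.5$ and rate 0.5 this selects the indices 4, 6, 7 and 8.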

In [1], the complexity of code construction on the BEC is
explained as $O(N \log N)$. However, Theorem 1 states that
the complexity of code construction, not only for the BEC
but also for an arbitrary symmetric B-MC, is $O(N)$. To
see this, let $\chi(N)$ denote the complexity of calculation of
$\{a_N^i\}_{i=1,\ldots,N}$, where the complexities of computations
of $\star$ and $\boxplus$ are considered to be constant. Then, it
is evaluated as

$$\chi(N) = N + \chi\!\left(\frac{N}{2}\right) = N + \frac{N}{2} + \frac{N}{4} + \cdots + 1 = O(N).$$

Since the complexity of selecting the $NR$-th smallest $P(A_i)$
is $O(N)$ even in the worst case [4], the complexity of code
construction is $O(N)$.

¹ In counting the depth of the decoding tree, we omit nodes with
degree 2, because messages of BP are passed through such nodes
unprocessed.

We would like to note that larger $N$ requires higher-precision
representation of messages for reliable SC decoding and density
evolution computations. In this regard, the complexity of SC
decoding discussed in [1] and the complexity of construction
mentioned above should be understood as referring to the number
of arithmetic operations on LLRs in BP and the number of
convolution operations in density evolution, respectively,
without regard to their precision. In practice, use of
finite-sized binning in density evolution may lead to imprecise
upper bounds of the block error probabilities, which, however,
still provide upper bounds relevant to SC decoding with the
same quantization as the binning scheme.

IV. Lower bound of the block error probability for arbitrary symmetric B-MC

To the authors' knowledge, no lower bound of the block
error probability of polar codes has been known. In this
section, we introduce a lower bound for a given choice of
information bits $\mathcal{I}$. We use the following fundamental
lemma.

Lemma 1. $\bigcup_{i \in \mathcal{I}} B_i = \bigcup_{i \in \mathcal{I}} A_i$.

Proof: The direction $\subseteq$ is trivial. Assume an event $v$
belongs to $A_i$. If $\hat{u}_1^{i-1} = u_1^{i-1}$, $v$ belongs
to $B_i$. Otherwise, $v$ belongs to $B_j$ for some $j < i$ which
belongs to $\mathcal{I}$. ■

Recalling $\mathcal{E} = \bigcup_{i \in \mathcal{I}} B_i$, it
immediately follows that
$P(\mathcal{E}) = P(\bigcup_{i \in \mathcal{I}} A_i)$ holds. The
events $\{A_i\}$ are easier to deal with than $\{B_i\}$. Several
bounds which use probabilities concerning $\{A_i\}$ are
considered in what follows.

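First, via Boole's inequality, the following lower bound is
obtained for any $\mathcal{S} \subseteq \mathcal{I}$:

$$P\Bigl(\bigcup_{i \in \mathcal{I}} A_i\Bigr) \ge P\Bigl(\bigcup_{i \in \mathcal{S}} A_i\Bigr)
\ge \sum_{i \in \mathcal{S}} P(A_i) - \sum_{(i,j) \in \mathcal{S}^2,\ i < j} P(A_i \cap A_j). \qquad (5)$$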

Maximization of the lower bound (5) with respect to
$\mathcal{S}$ is difficult since it is equivalent to the Max-Cut
problem, which is NP-hard [5]. However, without strict
optimization, one can obtain practically accurate lower bounds
for some rates and channels.
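The bound (5) and a non-optimal choice of $\mathcal{S}$ can be sketched as follows. The greedy rule shown here is only one possible heuristic and is not taken from the paper; the function names and the input format (a dictionary `single` of $P(A_i)$ and a dictionary `pair` of $P(A_i \cap A_j)$ keyed by `(i, j)` with `i < j`) are assumptions for the example.

```python
from itertools import combinations


def lower_bound(subset, single, pair):
    """Right-hand side of (5) for the index set `subset`."""
    s = sorted(subset)
    total = sum(single[i] for i in s)
    total -= sum(pair[(i, j)] for i, j in combinations(s, 2))
    return total


def greedy_subset(indices, single, pair):
    """Heuristic: add indices one at a time while the bound increases."""
    chosen, best = [], 0.0
    for i in sorted(indices, key=lambda k: -single[k]):
        cand = lower_bound(chosen + [i], single, pair)
        if cand > best:
            chosen, best = chosen + [i], cand
    return sorted(chosen), best
```

The pairwise probabilities required as input are the quantities that joint density evolution, introduced next, is designed to provide.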

In order to obtain the lower bound (5), evaluations of
probabilities of intersections of two $A_i$s are required. For
this purpose, we introduce a method which we call joint
density evolution. Let $(X_1, Y_1)$ and $(X_2, Y_2)$ denote pairs
of random variables which independently follow $a(x,y)$ and
$b(x,y)$, respectively. The convolution $a \star\star b$ is
defined as the joint density function of messages $(X, Y)$ where
$X = X_1 + X_2$ and $Y = Y_1 + Y_2$. Similarly, the convolution
$a \star\boxplus b$ is defined as the joint density function of
messages $(X, Y)$ where $X = X_1 + X_2$ and
$Y = 2\tanh^{-1}(\tanh(Y_1/2)\tanh(Y_2/2))$. The other
convolutions $a \boxplus\star b$ and $a \boxplus\boxplus b$ are
also defined in the same way.

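Theorem 2. For a symmetric B-MC which has a density $a_W$
of LLR, the joint density $a_N^{i,j}$ of LLR for the $i$-th bit
and the $j$-th bit after BP decoding is calculated recursively as

$$a_{2N}^{2i,2j} = a_N^{i,j} \star\star a_N^{i,j}, \qquad
a_{2N}^{2i-1,2j} = a_N^{i,j} \boxplus\star a_N^{i,j},$$

$$a_{2N}^{2i,2j-1} = a_N^{i,j} \star\boxplus a_N^{i,j}, \qquad
a_{2N}^{2i-1,2j-1} = a_N^{i,j} \boxplus\boxplus a_N^{i,j},$$

$$a_1^{1,1}(x,y) = \delta(x - y)\,a_W(x),$$

where $\delta(x)$ denotes the Dirac delta function.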

The probabilities $P(A_i \cap A_j)$, $P(A_i^c \cap A_j)$,
$P(A_i \cap A_j^c)$ and $P(A_i^c \cap A_j^c)$ are calculated by
appropriate integrations of the joint density $a_N^{i,j}$.
Extensions of joint density evolution to higher-order joint
distributions are also straightforward.
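A small sketch of joint density evolution for a binary symmetric channel follows. As before, the discrete dictionary representation, the BSC, and the function names are assumptions for illustration. The integration rule in `prob_both_err` assumes $i \neq j$ and independent fair coins for ties, which is one reading of the "appropriate integrations" mentioned above.

```python
import math


def _chk(x, y):
    t = math.tanh(x / 2) * math.tanh(y / 2)
    t = max(min(t, 1 - 1e-15), -1 + 1e-15)
    return 2 * math.atanh(t)


def _var(x, y):
    return x + y


def joint_conv(a, b, op_x, op_y):
    """Apply op_x to the first coordinates and op_y to the second ones."""
    out = {}
    for (x1, y1), p1 in a.items():
        for (x2, y2), p2 in b.items():
            key = (round(op_x(x1, x2), 9), round(op_y(y1, y2), 9))
            out[key] = out.get(key, 0.0) + p1 * p2
    return out


def joint_density(n, i, j, p):
    """Joint LLR density a_N^{i,j}, N = 2^n, for a BSC(p) as {(x, y): prob}."""
    if n == 0:
        l = math.log((1 - p) / p)
        return {(l, l): 1 - p, (-l, -l): p}   # delta(x - y) a_W(x)
    a = joint_density(n - 1, (i + 1) // 2, (j + 1) // 2, p)
    op_x = _var if i % 2 == 0 else _chk       # even index: variable node
    op_y = _var if j % 2 == 0 else _chk
    return joint_conv(a, a, op_x, op_y)


def _err_weight(x, tol=1e-9):
    return 1.0 if x < -tol else (0.5 if abs(x) <= tol else 0.0)


def prob_both_err(a):
    """P(A_i and A_j) for i != j, assuming independent coins on ties."""
    return sum(_err_weight(x) * _err_weight(y) * p for (x, y), p in a.items())
```

For $N = 2$ and $p = 0.1$ the first bit is in error exactly when one of the two channel outputs is flipped, in which case the second bit sees a tie, so `prob_both_err` returns half of $2p(1-p)$.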

For the BEC, density evolution has only to evolve
expectations of erasure probabilities. Correspondingly, joint
density evolution for the BEC is much simpler than that for a
general symmetric B-MC, as follows.

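Corollary 1. For the BEC with erasure probability $\epsilon$,

$$a_N^{i,j}(x,y) = p_N^{i,j}(0,0)\,\delta(x)\delta(y) + p_N^{i,j}(0,1)\,\delta(x)\delta_\infty(y)
+ p_N^{i,j}(1,0)\,\delta_\infty(x)\delta(y) + p_N^{i,j}(1,1)\,\delta_\infty(x)\delta_\infty(y),$$

where

$$\begin{aligned}
p_{2N}^{2i,2j}(0,0) &= p_N^{i,j}(0,0)^2, \\
p_{2N}^{2i,2j}(0,1) &= p_N^{i,j}(0,1)^2 + 2p_N^{i,j}(0,0)\,p_N^{i,j}(0,1), \\
p_{2N}^{2i,2j}(1,0) &= p_N^{i,j}(1,0)^2 + 2p_N^{i,j}(0,0)\,p_N^{i,j}(1,0), \\
p_{2N}^{2i,2j}(1,1) &= 1 - p_{2N}^{2i,2j}(0,0) - p_{2N}^{2i,2j}(0,1) - p_{2N}^{2i,2j}(1,0),
\end{aligned}$$

$$\begin{aligned}
p_{2N}^{2i,2j-1}(0,0) &= p_N^{i,j}(0,0)^2 + 2p_N^{i,j}(0,0)\,p_N^{i,j}(0,1), \\
p_{2N}^{2i,2j-1}(0,1) &= p_N^{i,j}(0,1)^2, \\
p_{2N}^{2i,2j-1}(1,1) &= p_N^{i,j}(1,1)^2 + 2p_N^{i,j}(1,1)\,p_N^{i,j}(0,1), \\
p_{2N}^{2i,2j-1}(1,0) &= 1 - p_{2N}^{2i,2j-1}(0,0) - p_{2N}^{2i,2j-1}(0,1) - p_{2N}^{2i,2j-1}(1,1),
\end{aligned}$$

$$\begin{aligned}
p_{2N}^{2i-1,2j}(0,0) &= p_N^{i,j}(0,0)^2 + 2p_N^{i,j}(0,0)\,p_N^{i,j}(1,0), \\
p_{2N}^{2i-1,2j}(1,0) &= p_N^{i,j}(1,0)^2, \\
p_{2N}^{2i-1,2j}(1,1) &= p_N^{i,j}(1,1)^2 + 2p_N^{i,j}(1,1)\,p_N^{i,j}(1,0), \\
p_{2N}^{2i-1,2j}(0,1) &= 1 - p_{2N}^{2i-1,2j}(0,0) - p_{2N}^{2i-1,2j}(1,0) - p_{2N}^{2i-1,2j}(1,1),
\end{aligned}$$

$$\begin{aligned}
p_{2N}^{2i-1,2j-1}(0,1) &= p_N^{i,j}(0,1)^2 + 2p_N^{i,j}(0,1)\,p_N^{i,j}(1,1), \\
p_{2N}^{2i-1,2j-1}(1,0) &= p_N^{i,j}(1,0)^2 + 2p_N^{i,j}(1,0)\,p_N^{i,j}(1,1), \\
p_{2N}^{2i-1,2j-1}(1,1) &= p_N^{i,j}(1,1)^2, \\
p_{2N}^{2i-1,2j-1}(0,0) &= 1 - p_{2N}^{2i-1,2j-1}(0,1) - p_{2N}^{2i-1,2j-1}(1,0) - p_{2N}^{2i-1,2j-1}(1,1),
\end{aligned}$$

$$p_1^{1,1}(0,0) = \epsilon, \qquad p_1^{1,1}(0,1) = 0, \qquad p_1^{1,1}(1,0) = 0, \qquad p_1^{1,1}(1,1) = 1 - \epsilon,$$

and where $\delta_\infty(x)$ denotes the Dirac delta function of
unit mass at infinity.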

Higher-order joint probabilities such as
$P(A_i \cap A_j \cap A_k)$ are calculated recursively by tracking
real vectors of an appropriate dimension in a way similar to that
described in Corollary 1.
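The recursion of Corollary 1 can be sketched generically by enumerating the four states of each of the two incoming message pairs, instead of writing out the closed forms. In the sketch, state 0 means an erased message (LLR zero) and state 1 means a known message (LLR at infinity); the function names are hypothetical.

```python
from itertools import product


def joint_step(p, x_is_var, y_is_var):
    """One step of the recursion; p maps (x_state, y_state) to probability."""
    out = {s: 0.0 for s in product((0, 1), repeat=2)}
    for (x1, y1), p1 in p.items():
        for (x2, y2), p2 in p.items():
            # Variable node: known if either input is known.
            # Check node: known only if both inputs are known.
            x = (x1 | x2) if x_is_var else (x1 & x2)
            y = (y1 | y2) if y_is_var else (y1 & y2)
            out[(x, y)] += p1 * p2
    return out


def joint_erasure(n, i, j, eps):
    """p_N^{i,j} for N = 2^n on a BEC with erasure probability eps."""
    if n == 0:
        return {(0, 0): eps, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0 - eps}
    p = joint_erasure(n - 1, (i + 1) // 2, (j + 1) // 2, eps)
    # Even indices correspond to variable nodes, odd ones to check nodes.
    return joint_step(p, i % 2 == 0, j % 2 == 0)
```

The entry `(0, 0)` of the result is the probability that both the $i$-th and the $j$-th bit are erased, and summing over one coordinate recovers the single-bit erasure probabilities.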

The complexity of computations (as measured in numbers of
convolution operations) of all $a_N^{i,j}$s is $O(N^2)$ as $N$
increases. Similarly, the complexity of computations of all
$s$-joint densities $a_N^{i_1, \ldots, i_s}$s is $O(N^s)$ as $N$
increases. On the other hand, the complexity grows exponentially
in $s$ since the dimension of the densities is $s$.



V. New upper bound of the block error probability for the BEC

The upper bound (4) of the block error probability of polar
codes may yield poor results. In particular, it exceeds one near
the capacity, as observed in [1] for the BEC. In this section,
a new upper bound which does not exceed one is derived for
the BEC. For the BEC, covariances among complements of
$\{A_i\}$, denoted by $\{A_i^c\}$, are always positive.

Lemma 2. For the BEC,
$P(\bigcap_{i \in \mathcal{I}} A_i^c) \ge \prod_{i \in \mathcal{I}} P(A_i^c)$
for any $\mathcal{I} \subseteq \{1, \ldots, N\}$.

Outline of proof: An event $A_i^c$ is expressed as
$\bigcap_k \{e_{i,k} \not\subseteq D\}$, where $e_{i,k}$ is an
error pattern of erasure messages for the $i$-th bit, and where
$D$ is a set of indices of erasure messages. ■
Using this property, the block error probability is upper
bounded simply by $1 - \prod_{i \in \mathcal{I}} P(A_i^c)$.
Furthermore, it is more accurately upper bounded by

$$1 - \prod_{i \in \mathcal{I}} P(A_i^c \mid A_{p(i)}^c), \qquad (6)$$

where $p(i) \in \mathcal{I}$ corresponds to a parent node of $A_i$
in a spanning tree of a complete graph which has nodes
corresponding to indices in $\mathcal{I}$.
$P(A_i^c \mid A_{p(i)}^c)$ is calculated via joint density
evolution. In order to tighten the upper bound (6), the maximum
weight directed spanning tree should be chosen from the complete
directed graph whose edges have weights $P(A_i^c \mid A_j^c)$,
where $i$ and $j$ are the sink and source nodes of the directed
edge, respectively, like the Chow-Liu tree [6].
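A sketch of a tree-based bound in the spirit of (6) for the BEC is given below. Several choices here are assumptions made for illustration: the bound is applied to the erasure events of Section VI rather than to the error events $A_i$, the root factor is taken to be the unconditional probability, and the tree is grown with Prim's algorithm on the symmetric weights $P(K_i \cap K_j)/(P(K_i)P(K_j))$, where $K_i$ denotes "bit $i$ is not erased". Maximizing the product of these weights is equivalent to maximizing the product of the conditional probabilities along the tree.

```python
def joint_erasure(n, i, j, eps):
    """Joint state probabilities for bits i and j; 0 = erased, 1 = known."""
    if n == 0:
        return {(0, 0): eps, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0 - eps}
    p = joint_erasure(n - 1, (i + 1) // 2, (j + 1) // 2, eps)
    out = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}
    for (x1, y1), p1 in p.items():
        for (x2, y2), p2 in p.items():
            x = (x1 | x2) if i % 2 == 0 else (x1 & x2)
            y = (y1 | y2) if j % 2 == 0 else (y1 & y2)
            out[(x, y)] += p1 * p2
    return out


def tree_upper_bound(n, info, eps):
    """1 - P(K_root) * prod P(K_i | K_parent(i)) over a greedy spanning tree."""
    nodes = sorted(info)
    # Assumes 0 < eps < 1, so that every P(K_i) is strictly positive.
    single = {i: joint_erasure(n, i, i, eps)[(1, 1)] for i in nodes}

    def weight(i, j):
        return joint_erasure(n, i, j, eps)[(1, 1)] / (single[i] * single[j])

    prob_none_erased = 1.0
    for i in nodes:
        prob_none_erased *= single[i]
    in_tree = {nodes[0]}
    while len(in_tree) < len(nodes):
        # Prim's step: attach the outside node with the heaviest edge.
        w, new = max((weight(i, j), j)
                     for i in in_tree for j in nodes if j not in in_tree)
        prob_none_erased *= w
        in_tree.add(new)
    return 1.0 - prob_none_erased
```

Because the pairwise weights are at least one by Lemma 2, the value returned never exceeds the simpler bound $1 - \prod_i P(K_i)$.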



VI. Techniques for tightening bounds 
In this section, some techniques for obtaining tighter bounds
of $P(\bigcup_{i \in \mathcal{I}} A_i)$ are shown. The first one
is applicable to polar codes over the BEC. In this case, the LLR
$L_N^{(i)}(y_1^N, u_1^{i-1})$ for the $i$-th bit in SC decoding,
when the all-zero information is transmitted, is either zero or
infinity. Let $A'_i$ be the event that the LLR for the $i$-th bit
is zero. We consider the events of erasure $\{A'_i\}$ rather than
$\{A_i\}$ for simplicity. We first define a partial ordering on
$\{1, \ldots, 2^n\}$.

Definition 2. For $i, j \in \{1, \ldots, 2^n\}$, $i \prec j$ if
and only if the $t$-th bit of the binary expansion of $j - 1$ is
one whenever the $t$-th bit of the binary expansion of $i - 1$ is
one, for any $t \in \{1, \ldots, n\}$.

The following lemma is useful for reducing the time complexity
of calculations of the bounds.

Lemma 3. $j \prec i \iff A'_i \subseteq A'_j$.

Proof: If a variable node outputs an erased message, a
check node with the same input as the variable node outputs
an erased message. Hence if $v \in A'_i$ then $v \in A'_j$. The
proof of the other direction is also obvious and is omitted. ■

Theorem 3. The block erasure probability of polar codes with
information bits $\mathcal{I}$ is
$P(\bigcup_{i \in M(\mathcal{I})} A'_i)$, where $M(\mathcal{I})$
denotes the set of minimal elements of $\mathcal{I}$ with respect
to $\prec$.

Proof: From Lemma 3,
$\bigcup_{i \in \mathcal{I}} A'_i = \bigcup_{i \in M(\mathcal{I})} A'_i$. ■

From this theorem, we have only to consider the set of
minimal elements $M(\mathcal{I})$ for the block erasure
probability of polar codes over the BEC, which can be used to
tighten bounds.
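The partial ordering of Definition 2 and the set $M(\mathcal{I})$ of Theorem 3 translate directly into bit operations. The sketch below is illustrative, with hypothetical function names; it treats the ordering as reflexive and excludes $j = i$ explicitly when testing minimality.

```python
def precedes(i, j):
    """True if every 1 bit of i-1 is also a 1 bit of j-1 (Definition 2)."""
    return ((i - 1) & (j - 1)) == (i - 1)


def minimal_elements(info):
    """Indices of `info` that no other index of `info` precedes."""
    return sorted(i for i in info
                  if not any(j != i and precedes(j, i) for j in info))
```

For the information set {4, 6, 7, 8} with $N = 8$, the index 8 is preceded by every other index, so only 4, 6 and 7 need to be considered.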

For polar codes over a general symmetric B-MC, the
following result similar to Theorem 3 is obtained.

Theorem 4. For integers $0 \le k \le n$ and
$0 \le i \le 2^{n-k} - 1$,

$P(A_{2^k i+1,\,2^n} \cup \cdots \cup A_{2^k(i+1),\,2^n})$ = 1 - (1

Proof is omitted for lack of space. Although the joint
probability
$P(A_{2^k i+1,\,2^n} \cup \cdots \cup A_{2^k(i+1),\,2^n})$ can be
calculated via joint density evolution, Theorem 4 allows us to
calculate it more efficiently via density evolution for
depth-$(n - k)$ trees and a few arithmetic operations.

From Theorem 4, one can efficiently obtain a tighter upper
bound than (4) by decomposing the block error event as
$\mathcal{E} = \bigcup_{i \in \mathcal{I}} A_i = \bigcup_{j \in \mathcal{J}} C_j$,
where each $C_j$ is expressed as
$A_{2^k i+1,\,2^n} \cup \cdots \cup A_{2^k(i+1),\,2^n}$. For
example, if we choose $\mathcal{I} = \{4, 6, 7, 8\}$ as
information bits for a polar code with $N = 8$, one obtains an
upper bound $P(A_4) + P(A_6) + P(A_7 \cup A_8)$, which is tighter
than $P(A_4) + P(A_6) + P(A_7) + P(A_8)$.

VII. Numerical calculations and simulations 

In this section, numerical calculation results are compared
with numerical simulation results. Figure 2 shows calculation
results of the upper bounds (4), (6) and the lower bound (5) of
the block erasure probability. The coding rate is 0.5 and the
blocklength is 1024. Only the minimal elements of the information
bits are considered, in view of Theorem 3, for calculation of
these bounds. Although we optimized the upper bound (6) and the
lower bound (5) only approximately, the lower bound is very
close to the upper bound for $\epsilon$ below 0.4. Our new upper
bound is always smaller than 1 and closer to the simulation
results, whereas the union bound exceeds 1 when
$\epsilon > 0.407$.

Fig. 2. Calculation results of upper bounds (4), (6) and lower
bound (5). Rate is 0.5. Blocklength is 1024.

VIII. Conclusion and future works 

The construction method of polar codes for symmetric B-MCs
with complexity $O(N)$ is shown. New upper and lower bounds for
the block error probability of particular polar codes and the
method of joint density evolution are derived. The method and
the bounds are also applicable to generalized polar codes [7].

Computing higher-order joint distributions and deriving
other bounds (e.g., Boole's inequality with higher-order terms)
are left as future work.